Khmer Spell Checker
Abstract
Khmer, the official language of Cambodia, is a complex language. Like Chinese, Japanese and Thai, it is written without spaces or other word delimiters, which is a major challenge for spell checking because there is no simple way to determine word boundaries. Spell checking Khmer is nevertheless feasible, although the process differs from that used for languages with word delimiters, such as English. In Khmer, words are constructed from root words that are made up of consonantal clusters, and these clusters can be misspelled. To spell check a text, we first find the approximate clusters of each input cluster. We then give the possible sequences of consonantal clusters to a hidden Markov model, which scores every sequence. Based on the possible sequences and their scores, we determine the word boundaries, whether each word is correctly spelled, and some alternative words if it is misspelled.
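The pipeline described in the abstract (approximate clusters, candidate sequences, scoring, word boundaries) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The confusion table, the lexicon, its probabilities and the substitution penalty are all hypothetical toy values, clusters are written as Latin placeholders instead of Khmer script, and a unigram word score with a Viterbi-style search stands in for the hidden Markov model.

```python
import math

# Hypothetical toy data: Latin placeholders stand in for Khmer consonantal clusters.
CONFUSABLE = {"ka": ["kha"], "kha": ["ka"], "ta": ["tha"], "tha": ["ta"]}
# Hypothetical lexicon: each word is a tuple of clusters with an assumed probability.
LEXICON = {("ka", "ma"): 0.4, ("tha", "na"): 0.3, ("ma",): 0.2, ("na",): 0.1}
SUBSTITUTION_PENALTY = math.log(0.1)  # assumed cost of replacing one cluster
MAX_WORD_LEN = max(len(word) for word in LEXICON)


def candidates(cluster):
    """Return the cluster itself plus its approximate (confusable) clusters."""
    return [cluster] + CONFUSABLE.get(cluster, [])


def match_words(clusters):
    """Yield (word, log score) for lexicon words that approximately match."""
    for word, prob in LEXICON.items():
        if len(word) != len(clusters):
            continue
        if all(w in candidates(c) for w, c in zip(word, clusters)):
            edits = sum(w != c for w, c in zip(word, clusters))
            yield word, math.log(prob) + edits * SUBSTITUTION_PENALTY


def best_segmentation(clusters):
    """Viterbi-style search for the highest-scoring sequence of lexicon words.

    Returns a list of (typed clusters, suggested word) pairs, or None when no
    sequence of lexicon words covers the whole input.
    """
    n = len(clusters)
    best = [(-math.inf, None)] * (n + 1)  # (score, backpointer) per position
    best[0] = (0.0, None)
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_WORD_LEN), end):
            prev_score = best[start][0]
            if prev_score == -math.inf:
                continue  # no valid segmentation reaches this start position
            for word, score in match_words(clusters[start:end]):
                total = prev_score + score
                if total > best[end][0]:
                    best[end] = (total, (start, word))
    if best[n][0] == -math.inf:
        return None
    # Follow the backpointers from the end to recover the word boundaries.
    words, pos = [], n
    while pos > 0:
        start, word = best[pos][1]
        words.append((tuple(clusters[start:pos]), word))
        pos = start
    return list(reversed(words))


if __name__ == "__main__":
    for typed, suggestion in best_segmentation(["kha", "ma", "tha", "na"]):
        status = "ok" if typed == suggestion else "misspelled"
        print(typed, "->", suggestion, status)
```

In this sketch a word counts as misspelled when the best-scoring lexicon word differs from the clusters that were typed, and the boundaries fall out of the backpointers. A fuller model would also score transitions between neighbouring words or clusters, as a hidden Markov model does.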
Similar resources
Sharif Text Editor: A System for Editing and Spell Checking Persian
In this paper, we will introduce an intelligent system to edit and spell check Persian texts. The goal is editing and preprocessing Persian texts for natural language processing tasks. This system is based on an expandable and engineering approach and is composed of three subsystems: Persian text editor, spell checker and stemmer. These parts interact with each other to edit texts. To do this, ...
A Survey of Spelling Error Detection and Correction Techniques
Spelling correction is the process of detecting, and sometimes suggesting replacements for, incorrectly spelled words in a text. A spell checker is an application program that flags words in a document that may not be spelled correctly. A spell checker may be stand-alone, capable of operating on a block of text, or part of a larger application such as a word processor or electronic dictionary. When some text is given as an input to spell chec...
Design and Implementation of Punjabi Spell Checker
Spellcheckers are the basic tools needed for word processing and document preparation. Designing a spell checker for Indian languages such as Punjabi poses many new challenges not found in English, which complicates the design of the spell checker. Punjabi language is far different from Western languages in phonetic properties and grammatical rules. Thus the existing algorithms and techniques t...
Context Sensitive Query Correction Method for Query-Based Text Summarization
Contextual spell correction is very important for real word error correction. It gives the correct word for an incorrect word in a particular sentence. The traditional spell checker can correct those misspelled words which are not present in dictionary but here we try to develop a spell checker which can give appropriate word on the basis of the contextual meaning of the sentence. This spell ch...
Improving Finite-State Spell-Checker Suggestions with Part of Speech N-Grams
We demonstrate a finite-state implementation of context-aware spell checking utilizing an N-gram based part of speech (POS) tagger to rerank the suggestions from a simple edit-distance based spell-checker. We demonstrate the benefits of context-aware spellchecking for English and Finnish and introduce modifications that are necessary to make traditional N-gram models work for morphologically mo...